Improving Probabilistic Record Linkage Using Statistical Prediction Models

Abstract

Record linkage brings together information from records in two or more data sources that are believed to belong to the same statistical unit, based on a common set of matching variables. Matching variables, however, can appear with errors and variations, and the challenge is to link statistical units that are subject to error. We provide an overview of record linkage techniques and specifically investigate the classic Fellegi and Sunter probabilistic record linkage framework to assess whether the decision rule for classifying pairs into sets of matches and non-matches can be improved by incorporating a statistical prediction model. We also study whether the enhanced linkage rule can provide better results in terms of preserving associations between variables in the linked data file that are not used in the record linkage procedure. A simulation study and an application based on real data are used to evaluate the methods.
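
For readers unfamiliar with the classic Fellegi and Sunter decision rule mentioned in the abstract, the following is a minimal sketch of it, not the enhanced rule the paper proposes. Each matching variable contributes a log-likelihood-ratio weight built from an m-probability (agreement given a true match) and a u-probability (agreement given a true non-match), and the summed weight is compared against two thresholds. The field names, probabilities, and thresholds below are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical m-probabilities: P(field agrees | pair is a true match).
M_PROBS = {"surname": 0.95, "birth_year": 0.90, "postcode": 0.85}
# Hypothetical u-probabilities: P(field agrees | pair is a true non-match).
U_PROBS = {"surname": 0.01, "birth_year": 0.05, "postcode": 0.02}


def match_weight(agreement):
    """Sum the per-field log2 likelihood ratios for one record pair.

    `agreement` maps each field name to True (values agree) or False.
    """
    total = 0.0
    for field, agrees in agreement.items():
        m, u = M_PROBS[field], U_PROBS[field]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, lower=0.0, upper=8.0):
    """Apply the two-threshold rule (illustrative thresholds)."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"


if __name__ == "__main__":
    pair = {"surname": True, "birth_year": True, "postcode": False}
    w = match_weight(pair)
    print(f"weight={w:.2f} -> {classify(w)}")
```

In practice the m- and u-probabilities are estimated from the data rather than fixed by hand, and the thresholds are chosen to control the expected error rates. Pairs that fall between the two thresholds are typically sent for clerical review.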

Related articles

Probabilistic Linkage of Persian Record with Missing Data

Extended Abstract. When comprehensive information about a topic is scattered among two or more data sets, using only one of those data sets loses the information available in the others. Hence, it is necessary to integrate the scattered information into a single comprehensive data set. On the other hand, sometimes we are interested in detecting duplicates within a data set. The i...

Probabilistic record linkage

Studies involving the use of probabilistic record linkage are becoming increasingly common. However, the methods underpinning probabilistic record linkage are not widely taught or understood, and therefore these studies can appear to be a 'black box' research tool. In this article, we aim to describe the process of probabilistic record linkage through a simple exemplar. We first introduce the c...

Improving Temporal Record Linkage Using Regression Classification

Temporal record linkage is the process of identifying groups of records that are collected over a period of time, such as in census or voter registration databases, where records in the same group represent the same real-world entity. Such databases often contain temporal information, such as the time when a record was created or when it was modified. Unlike traditional record linkage, which co...

Validating Distance-Based Record Linkage with Probabilistic Record Linkage

This work compares two alternative methods for record linkage: distance-based and probabilistic record linkage. It compares the performance of both approaches when the data are categorical. To that end, a distance over ordinal and nominal scales is defined. The paper shows that, for categorical data, distance-based and probabilistic record linkage lead to similar results in relation to the num...

Improving Record Linkage through Pedigrees

Journal

Journal title: International Statistical Review

Year: 2022

ISSN: 0306-7734, 1751-5823

DOI: https://doi.org/10.1111/insr.12535